Asymptotic bias of stochastic gradient search
Authors
Abstract
Similar resources
Asymptotic Bias of Stochastic Gradient Search by Vladislav
The asymptotic behavior of the stochastic gradient algorithm using biased gradient estimates is analyzed. Relying on arguments based on dynamic system theory (chain-recurrence) and differential geometry (Yomdin theorem and Lojasiewicz inequalities), upper bounds on the asymptotic bias of this algorithm are derived. The results hold under mild conditions and cover a broad class of algorithms use...
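As a rough illustration of the setting described above (not the paper's analysis itself), the following Python sketch runs stochastic gradient descent on a toy quadratic using a gradient estimate with a fixed bias; the iterates settle near, but not at, the true minimizer, with the offset governed by the bias. All names and constants below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective f(theta) = 0.5 * ||theta - theta_star||^2, minimized at theta_star.
theta_star = np.array([1.0, -2.0])

def biased_gradient(theta, bias, noise_scale=0.1):
    """Gradient of f plus a fixed bias term and zero-mean noise (illustrative)."""
    true_grad = theta - theta_star
    return true_grad + bias + noise_scale * rng.standard_normal(theta.shape)

bias = np.array([0.05, 0.05])     # constant gradient bias (assumed for illustration)
theta = np.zeros(2)
for n in range(1, 20001):
    step = 1.0 / n                # decreasing step sizes gamma_n = 1/n
    theta -= step * biased_gradient(theta, bias)

# The iterates end up offset from theta_star by roughly the size of the bias.
print("final iterate:", theta)
print("distance to true minimizer:", np.linalg.norm(theta - theta_star))
```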
Exploration of the (Non-)Asymptotic Bias and Variance of Stochastic Gradient Langevin Dynamics
Applying standard Markov chain Monte Carlo (MCMC) algorithms to large data sets is computationally infeasible. The recently proposed stochastic gradient Langevin dynamics (SGLD) method circumvents this problem in three ways: it generates proposed moves using only a subset of the data, it skips the Metropolis-Hastings accept-reject step, and it uses sequences of decreasing step sizes. In Teh et ...
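A minimal sketch of the SGLD update described above, on a toy Gaussian model: the stochastic gradient is computed from a mini-batch rescaled by N/n, no Metropolis-Hastings accept-reject step is applied, and the step size decreases over time. The model, data, and step-size schedule are assumptions for illustration, not the setup analyzed in the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: x_i ~ N(theta, 1) with a N(0, 10) prior on theta.
N = 1000
data = rng.normal(2.0, 1.0, size=N)

def grad_log_prior(theta):
    return -theta / 10.0

def grad_log_lik(theta, batch):
    return np.sum(batch - theta)

theta = 0.0
batch_size = 10
samples = []
for t in range(1, 5001):
    eps = 1e-4 / t**0.55                      # decreasing step sizes (assumed schedule)
    batch = rng.choice(data, size=batch_size, replace=False)
    grad = grad_log_prior(theta) + (N / batch_size) * grad_log_lik(theta, batch)
    theta += 0.5 * eps * grad + rng.normal(0.0, np.sqrt(eps))   # no accept-reject step
    samples.append(theta)

print("posterior mean estimate:", np.mean(samples[1000:]))
```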
Towards Faster Stochastic Gradient Search
Stochastic gradient descent is a general algorithm which includes LMS, on-line backpropagation, and adaptive k-means clustering as special cases. The standard choices of the learning rate η (both adaptive and fixed functions of time) often perform quite poorly. In contrast, our recently proposed class of "search then converge" learning rate schedules (Darken and Moody, 1990) display the theore...
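A minimal sketch, assuming the simplest member of the "search then converge" family, η(t) = η0 / (1 + t/τ): the rate stays roughly constant ("search") while t ≪ τ and decays like 1/t ("converge") for t ≫ τ. The LMS example and constants are illustrative choices, not those of the cited paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def search_then_converge(t, eta0=0.2, tau=200.0):
    """'Search then converge' schedule: ~eta0 for t << tau, ~eta0*tau/t for t >> tau."""
    return eta0 / (1.0 + t / tau)

# LMS / stochastic gradient on a 1-D linear regression y = w* x + noise.
w_star, w = 3.0, 0.0
for t in range(1, 10001):
    x = rng.standard_normal()
    y = w_star * x + 0.1 * rng.standard_normal()
    err = y - w * x
    w += search_then_converge(t) * err * x    # LMS update with the scheduled rate

print("estimated weight:", w)   # should end up close to w_star = 3.0
```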
Adapting Bias by Gradient
Appropriate bias is widely viewed as the key to efficient learning and generalization. I present a new algorithm, the Incremental Delta-Bar-Delta (IDBD) algorithm, for the learning of appropriate biases based on previous learning experience. The IDBD algorithm is developed for the case of a simple linear learning system, the LMS or delta rule with a separate learning rate parameter for each input. The I...
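A minimal sketch of an IDBD-style update for the linear (LMS) case described above: each input keeps its own learning rate α_i = exp(β_i), and β_i is adapted by a meta-learning rate using a trace h_i of recent weight changes. The constants and toy data are assumptions; consult Sutton's paper for the exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 5
w = np.zeros(n_inputs)            # weights of the linear learner
beta = np.full(n_inputs, -3.0)    # log learning rates, alpha_i = exp(beta_i)
h = np.zeros(n_inputs)            # memory trace of recent weight changes
meta_rate = 0.01                  # meta-learning rate (illustrative choice)

w_true = np.array([1.0, -2.0, 0.0, 0.5, 3.0])

for t in range(20000):
    x = rng.standard_normal(n_inputs)
    y = w_true @ x + 0.1 * rng.standard_normal()
    delta = y - w @ x                         # prediction error
    beta += meta_rate * delta * x * h         # adapt each log learning rate
    alpha = np.exp(beta)
    w += alpha * delta * x                    # per-input LMS update
    h = h * np.maximum(0.0, 1.0 - alpha * x * x) + alpha * delta * x

print("learned weights:", np.round(w, 2))
print("per-input learning rates:", np.round(np.exp(beta), 4))
```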
True Asymptotic Natural Gradient Optimization
We introduce a simple algorithm, True Asymptotic Natural Gradient Optimization (TANGO), that converges to a true natural gradient descent in the limit of small learning rates, without explicit Fisher matrix estimation. For quadratic models the algorithm is also an instance of averaged stochastic gradient, where the parameter is a moving average of a “fast”, constant-rate gradient descent. TANGO...
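The quadratic-case interpretation mentioned above suggests a minimal sketch (an assumption-laden illustration, not the TANGO algorithm itself): a "fast" iterate is updated by constant-rate stochastic gradient descent, and the reported parameter is a moving average of that fast iterate.

```python
import numpy as np

rng = np.random.default_rng(0)

theta_star = np.array([1.0, -2.0])   # minimizer of a toy quadratic objective

def noisy_grad(theta):
    return (theta - theta_star) + 0.5 * rng.standard_normal(theta.shape)

fast = np.zeros(2)       # "fast" iterate: constant-rate stochastic gradient descent
avg = np.zeros(2)        # reported parameter: moving average of the fast iterate
rate = 0.1               # constant step size (assumed)
mix = 0.001              # averaging rate (assumed)

for t in range(50000):
    fast -= rate * noisy_grad(fast)
    avg += mix * (fast - avg)        # exponential moving average of the fast iterate

print("fast iterate:", np.round(fast, 3))       # noisy, bounces around theta_star
print("averaged parameter:", np.round(avg, 3))  # much closer to theta_star
```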
Journal
Journal title: The Annals of Applied Probability
Year: 2017
ISSN: 1050-5164
DOI: 10.1214/16-aap1272